Deciding when to stop: Efficient stopping of active learning guided drug-target prediction

Authors

  • Maja Temerinac-Ott
  • Armaghan W. Naik
  • Robert F. Murphy
Abstract

Active learning has been shown to reduce the number of experiments needed to obtain high-confidence drug-target predictions. However, to actually save experiments with active learning, it is crucial to have a method for evaluating the quality of the current prediction and deciding when to stop the experimentation process. Only by applying reliable stopping criteria to active learning can time and costs in the experimental process actually be saved. We compute active learning traces on simulated drug-target matrices in order to learn a regression model for the accuracy of the active learner. By analyzing the performance of the regression model on simulated data, we design stopping criteria for previously unseen experimental matrices. We demonstrate on four previously characterized drug effect data sets that applying the stopping criteria can result in up to 40% savings of the total experiments for highly accurate predictions.
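The workflow in the abstract (traces on simulated matrices, a regression model for accuracy, a stopping rule) can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the matrix generator, the column-majority predictor, random batch selection in place of a real active learner, the two trace features, the ordinary least-squares model, and the 0.95 target are all hypothetical and are not the authors' method.

```python
import numpy as np

def simulate_matrix(rng, n_drugs=20, n_targets=20, p_flip=0.1):
    """Hypothetical generator: each target has a dominant binary effect,
    flipped independently per drug with probability p_flip."""
    base = rng.integers(0, 2, size=n_targets)
    flips = rng.random((n_drugs, n_targets)) < p_flip
    return np.where(flips, 1 - base, base)

def column_majority_predict(values, observed):
    """Keep measured cells; fill unmeasured cells with their column's majority."""
    pred = values.copy()
    for j in range(values.shape[1]):
        seen = values[observed[:, j], j]
        fill = int(seen.mean() >= 0.5) if seen.size else 0
        pred[~observed[:, j], j] = fill
    return pred

def run_trace(truth, rng, batch=20):
    """Return per-round features and the true accuracy (known only in simulation)."""
    observed = np.zeros(truth.shape, dtype=bool)
    order = rng.permutation(truth.size)  # assumption: random, not active, selection
    pred = column_majority_predict(truth, observed)
    feats, accs = [], []
    for start in range(0, truth.size, batch):
        idx = order[start:start + batch]
        # Accuracy of the previous predictions on the newly measured cells;
        # this is observable in a real campaign, unlike full-matrix accuracy.
        batch_acc = np.mean(pred.flat[idx] == truth.flat[idx])
        observed.flat[idx] = True
        pred = column_majority_predict(truth, observed)
        feats.append([1.0, observed.mean(), batch_acc])
        accs.append(np.mean(pred == truth))
    return np.array(feats), np.array(accs)

def fit_accuracy_model(n_sims=50, seed=0):
    """Least-squares fit of true accuracy on trace features over simulated runs."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for _ in range(n_sims):
        f, a = run_trace(simulate_matrix(rng), rng)
        X.append(f)
        y.append(a)
    w, *_ = np.linalg.lstsq(np.vstack(X), np.concatenate(y), rcond=None)
    return w

def stopping_round(feats, w, target=0.95):
    """First round whose predicted accuracy reaches target, else the last round."""
    hits = np.flatnonzero(feats @ w >= target)
    return int(hits[0]) if hits.size else len(feats) - 1

w = fit_accuracy_model()
rng = np.random.default_rng(1)
feats, accs = run_trace(simulate_matrix(rng), rng)  # stands in for an unseen matrix
stop = stopping_round(feats, w)
print(f"stop after round {stop}: {feats[stop, 1]:.0%} measured, "
      f"true accuracy {accs[stop]:.2f}")
```

The point of the sketch is the separation of roles: true accuracy is used only to train the regression on simulated data, while the stopping decision on a new matrix relies solely on features that can be computed without knowing the remaining ground truth.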

Similar resources

Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation

In this paper, we address the issue of deciding when to stop active learning for building a labeled training corpus. Firstly, this paper presents a new stopping criterion, classification-change, which considers the potential ability of each unlabeled example on changing decision boundaries. Secondly, a multi-criteria-based combination strategy is proposed to solve the problem of predefining an a...

Optimal stopping rules for active learning

Active learning algorithms aim to minimise the amount of labelled data used to learn a target concept. However, there is no formal framework for expressing the trade-off between needed accuracy and the cost of label acquisition, rendering the objective evaluation of algorithms problematic and the development of criteria for deciding when to terminate data acquisition impossible. This paper aims...

Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem

In this paper, we analyze the effect of resampling techniques, including under-sampling and over-sampling used in active learning for word sense disambiguation (WSD). Experimental results show that under-sampling causes negative effects on active learning, but over-sampling is a relatively good choice. To alleviate the within-class imbalance problem of over-sampling, we propose a bootstrap-based ...

Sample Efficient Policy Search for Optimal Stopping Domains

Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging p...

The distorting effect of deciding to stop sampling

People usually collect information to serve specific goals and often end up with samples that are unrepresentative of the underlying population. This can introduce biases on later judgments that generalize from these samples. Here we show that goals influence not only what information we collect, but also when we decide to terminate search. Using an optimal stopping analysis, we demonstrate tha...

Journal title:
  • CoRR

Volume abs/1504.02406  Issue 

Pages  -

Publication date 2015